An architecture for high instruction level parallelism

Authors

  • Siamak Arya
  • Howard Sachs
  • Sreeram Duvvuru
Abstract

High instruction level parallelism (ILP) can only be achieved when data flow and control flow constraints have been removed or reduced. Data flow constraints, not inherent in the original code, arise from lack of sufficient resources for initiation and execution of multiple instructions concurrently. Control flow problems are caused by branches, which force unpredictable changes in the sequential order of code execution. Removing these obstacles allows for the formation of larger basic blocks, resulting in higher ILP. The data flow problems are reduced by increasing the number of functional units, registers, and condition bits, by pipelining the functional units, and by using nonblocking caches. The control flow problem is reduced by using techniques such as conditional execution, speculative execution, and software pipelining, leveraging hardware support. Thus, for high ILP, the processor architecture should include very closely tied hardware and compiler architectures. An architecture that supports the above features, Software Scheduled SuperScalar, is presented in this paper.
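As a rough illustration of the conditional execution idea mentioned in the abstract, the sketch below contrasts a branching function with an if-converted one in which both candidate results are computed unconditionally and a condition bit selects between them. This is not code from the paper: the function names are hypothetical, and Python is used only to model the transformation, not actual hardware predication.

```python
def abs_diff_branchy(a: int, b: int) -> int:
    """Control-flow version: the branch splits the code into separate basic blocks."""
    if a > b:
        return a - b
    return b - a


def abs_diff_predicated(a: int, b: int) -> int:
    """If-converted version: a single straight-line block with no branch."""
    p = int(a > b)   # condition bit: 1 if a > b, else 0
    t = a - b        # computed unconditionally; selected when p == 1
    f = b - a        # computed unconditionally; selected when p == 0
    # Select one result by the predicate instead of branching.
    return p * t + (1 - p) * f
```

In the second version the two subtractions do not depend on each other, so a scheduler would be free to issue them in the same cycle; the cost is that one of the two results is always discarded.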


Similar resources

Scalable and Flexible heterogeneous multi-core system

Multi-core systems have wide utility in today's applications due to lower power consumption and high performance. Many researchers are aiming at improving the performance of these systems by providing a flexible multi-core architecture. Flexibility in a multi-core processor system provides high throughput for uniform parallel applications as well as high performance for more general work. This fl...


MT-ADRES: Multithreading on Coarse-Grained Reconfigurable Architecture

The coarse-grained reconfigurable architecture ADRES (Architecture for Dynamically Reconfigurable Embedded Systems) and its compiler offer high instruction-level parallelism (ILP) to applications by means of a sparsely interconnected array of functional units and register files. As high-ILP architectures achieve only low parallelism when executing partially sequential code segments, which is al...


JMA: The Java-Multithreading Architecture for Embedded Processors

Embedded processors are increasingly deployed in applications requiring high performance with good real-time characteristics whilst being low power. Parallelism has to be extracted in order to improve the performance at an architectural level. Extracting instruction level parallelism requires extensive speculation which adds complexity and increases power consumption. Alternatively, parallelism...


Jetpipeline: a Hybrid Pipeline Architecture for Instruction-level Parallelism

High performance processors based on pipeline processing play an important role in scientific and engineering computation. However, it is difficult to gain a satisfactory solution when taking both a high degree of flexibility of parallel processing and low hardware complexity into account. This paper proposes a hybrid pipeline architecture named Jetpipeline that possesses high degree of flexibilit...


An Instruction Cache Architecture for Parallel Execution of Java Threads

Designing a Java processor supporting horizontal multithreading has become more attractive as network computing gains importance. Unlike traditional superscalar processors, which issue multiple instructions from a single instruction stream to exploit instruction level parallelism (ILP), the horizontal multithreading Java processors issue multiple instructions (bytecodes) fr...


An Architecture Framework for Introducing Predicated Execution into

Growing demand for high performance in embedded systems is creating new opportunities for Instruction-Level Parallelism (ILP) techniques that are traditionally used in high performance systems. Predicated execution, an important ILP technique, can be used to improve branch handling, reduce frequently mispredicted branches, and expose multiple execution paths to hardware resources. However, the...




Publication date: 1995